NAVAL 

POSTGRADUATE 

SCHOOL 

MONTEREY,  CALIFORNIA 


THESIS 


A  CONTRASTING  LOOK  AT  NETWORK  FORMATION 
MODELS  AND  THEIR  APPLICATION  TO  THE  MINIMUM 

SPANNING  TREE 

by 

Deanne  B.  McPherson 

September  2009 

Thesis  Advisor: 

David  L.  Alderson 

Second  Reader: 

Timothy  H.  Chung 

Approved  for  public  release;  distribution  is  unlimited 


REPORT DOCUMENTATION PAGE

Form Approved OMB No. 0704-0188


1.  AGENCY  USE  ONLY  (Leave  blank) 

2.  REPORT  DATE 

September  2009 

3.  REPORT  TYPE  AND  DATES  COVERED 

Master’s  Thesis 

4. TITLE AND SUBTITLE A Contrasting Look at Network Formation Models and
Their Application to the Minimum Spanning Tree

5.  FUNDING  NUMBERS 

6.  AUTHOR(S)  Deanne  B.  McPherson 

_ 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Naval  Postgraduate  School 

Monterey,  CA  93943-5000 

8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 

9.  SPONSORING  /MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

N/A 

10.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 

11. SUPPLEMENTARY NOTES The views expressed in this thesis are those of the author and do
not reflect the official policy or position of the Department of Defense or the U.S. Government.

12a.  DISTRIBUTION  /  AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  is  unlimited 

12b.  DISTRIBUTION  CODE 

A 

13. ABSTRACT (maximum 200 words)

Networks  are  prevalent  in  man-made  and  natural  systems  throughout  the  world.  Despite  recent  efforts  to  characterize 
and  catalog  networks  of  all  kinds,  there  is  considerably  less  known  about  the  forces  that  drive  network  formation.  For 
many  complex  systems,  it  is  unclear  whether  networks  are  the  result  of  an  explicit  effort  to  achieve  some  overarching 
global  system  objective,  or  if  network  structure  is  just  a  byproduct  of  local,  selfish  decisions.  In  this  thesis,  we  review 
network  formation  models  and  conduct  numerical  experiments  to  contrast  their  behavior  and  the  structural  features  of 
the  networks  they  generate.  We  focus  primarily  on  problems  related  to  the  formation  of  minimum  spanning  trees  and 
consider  the  cost  of  selfish  behavior,  more  commonly  known  as  the  price  of  anarchy,  in  network  formation.  We  also 
explore  differences  between  local,  decentralized  methods  for  network  formation  and  their  global,  centralized 
counterparts. 

14.  SUBJECT  TERMS  Network  formation,  graph  generation,  minimum  spanning  tree,  price  of 
anarchy 

15.  NUMBER  OF 

PAGES 

67 

16.  PRICE  CODE 

17.  SECURITY 
CLASSIFICATION  OF 
REPORT 

Unclassified 

18.  SECURITY 

CLASSIFICATION  OF  THIS 
PAGE 

Unclassified 

19.  SECURITY 
CLASSIFICATION  OF 
ABSTRACT 

Unclassified 

20.  LIMITATION  OF 
ABSTRACT 

UU 

NSN  7540-01-280-5500  Standard  Form  298  (Rev.  2-89) 


Prescribed  by  ANSI  Std.  239-18 




Approved  for  public  release;  distribution  is  unlimited 


A  CONTRASTING  LOOK  AT  NETWORK  FORMATION  MODELS  AND  THEIR 
APPLICATION  TO  THE  MINIMUM  SPANNING  TREE 

Deanne  B.  McPherson 
Lieutenant,  United  States  Navy 
B.S.,  University  of  Scranton,  2001 

Submitted  in  partial  fulfillment  of  the 
requirements  for  the  degree  of 

MASTER  OF  SCIENCE  IN  OPERATIONS  RESEARCH 


from  the 


NAVAL  POSTGRADUATE  SCHOOL 
September  2009 


Author:  Deanne  B.  McPherson 


Approved  by:  David  L.  Alderson 

Thesis  Advisor 


Timothy  H.  Chung 
Second  Reader 


Robert  F.  Dell 

Chairman,  Department  of  Operations  Research 




ABSTRACT 


Networks  are  prevalent  in  man-made  and  natural  systems  throughout  the  world. 
Despite  recent  efforts  to  characterize  and  catalog  networks  of  all  kinds,  considerably  less 
is  known  about  the  forces  that  drive  network  formation.  For  many  complex  systems,  it  is 
unclear  whether  networks  are  the  result  of  an  explicit  effort  to  achieve  some  overarching 
global  system  objective,  or  if  network  structure  is  just  a  byproduct  of  local,  selfish 
decisions.  In  this  thesis,  we  review  network  formation  models  and  conduct  numerical 
experiments  to  contrast  their  behavior  and  the  structural  features  of  the  networks  they 
generate. We focus primarily on problems related to the formation of minimum spanning
trees and consider the cost of selfish behavior, more commonly known as the price of
anarchy, in network formation. We also explore differences between local, decentralized
methods for network formation and their global, centralized counterparts.


EXECUTIVE SUMMARY


Networks  are  prevalent  in  man-made  and  natural  systems  throughout  the  world. 
Despite  recent  efforts  to  characterize  and  catalog  networks  of  all  kinds,  considerably  less 
is  known  about  the  forces  that  drive  network  formation.  For  many  complex  systems  it  is 
unclear  whether  networks  are  the  result  of  an  explicit  effort  to  achieve  some  overarching 
global  system  objective,  or  if  network  structure  is  just  a  byproduct  of  local,  selfish 
decisions.  We  conduct  numerical  experiments  using  network  formation  models  to 
examine  the  behavior  and  structural  properties  of  the  networks  they  form. 

We discuss several models of network formation, including the Erdős and Rényi
(1959) random graph model, the random geometric graph model, and a preferential
attachment model, popularized by Barabási and Albert (1999), which produces a network
with a node degree distribution that can be described by a power law. We then review an
optimization-based model, from which we derive the model used in our numerical
experiments, as well as game-theoretic models.

We provide a review of the minimum spanning tree (MST) problem. We
introduce it as a formal optimization problem, which is non-trivial to solve as an integer
linear program for large problems. We then review two centralized algorithms, Kruskal’s
(1956) and Prim’s (1957), which take advantage of the special network structure in order
to solve the MST problem more easily. In contrast to these global algorithms, we review
the decentralized algorithm of Gallager, Humblet and Spira (1983), which utilizes
“message passing” between nodes to solve for the MST.

Our numerical experiments use a simplified version of the optimization-based
model, which grows networks by adding nodes to the unit square one at a time. Each
new node forms an arc in the network to the node that minimizes the arc’s Euclidean
distance. By restricting the objective function to distance alone, we determine the price of
anarchy by comparing the total network cost to the optimal cost of the MST. The results,
based on 10,000 trials, indicate that the cost of a network formed with 100 nodes is
approximately 50% greater than that of the MST. By altering the arrival order of the nodes
to the network, we observe that precedence plays an important role in these network growth
heuristics.
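
As an illustration only, the growth heuristic and its comparison against the MST can be
sketched in a few lines of Python. This is not the Java code used for the experiments in
this thesis: the function names (euclidean, greedy_growth_cost, mst_cost, cost_ratio) are
hypothetical, the node locations are assumed to be uniform in the unit square, and the MST
is computed with a simple quadratic-time version of Prim's algorithm.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def greedy_growth_cost(points):
    """Total arc length when each arriving node links to its nearest predecessor."""
    total = 0.0
    for i in range(1, len(points)):
        total += min(euclidean(points[i], points[j]) for j in range(i))
    return total


def mst_cost(points):
    """Cost of the Euclidean MST via Prim's algorithm on the complete graph."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known arc from the tree to each node
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], euclidean(points[u], points[v]))
    return total


def cost_ratio(n, rng):
    """Ratio of grown-network cost to MST cost for n assumed-uniform points."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return greedy_growth_cost(points) / mst_cost(points)
```

Because the grown network is itself a spanning tree, the ratio is never below 1. A small
hand-checkable case shows the role of arrival order: for collinear points arriving in the
order (0, 0), (1, 0), (0.4, 0), the grown network costs 1.4 while the MST costs 1.0.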

We also consider network rewiring experiments, in which we allow the nodes to
reevaluate their initial arcs to see if they can improve their objective function. They
continue this process until no node can improve, and the network is in equilibrium. Over
10,000 trials, the cost of the rewired 100-node network is approximately 15% greater than
that of the MST. We also alter the order in which the nodes reevaluate their arcs, and
determine that precedence in the rewiring also affects the final network structure.

We conclude with suggestions for continued research to determine if there might
be an interpretation of the local, myopic decision process we utilized that lends itself to
an equivalent global solution. We appeal to the example of the Internet, in which duality
theory has helped to understand the behavior of the complex network.


I. INTRODUCTION


A graph is a collection of nodes (also called vertices) connected by a set of arcs
(also called edges). A network is a specific type of graph, where associated with each arc
or  node  is  additional  information,  such  as  the  cost  or  capacity  of  the  arc  or  the  demand  at 
a  node.  Networks  are  integral  to  a  variety  of  systems  that  we  rely  upon  each  day.  Our 
transportation  system  is  made  up  of  a  variety  of  networks  including  road,  rail  and  airline 
networks.  Our  electrical  system  is  a  network  of  wires  that  ensures  power  reaches  homes 
and  businesses.  Communications  systems,  including  the  Internet,  are  expanding  beyond 
the  typical  hard  wired  lines  to  include  wireless  networks.  Even  individuals’  relationships 
with  one  another  can  be  viewed  as  a  network  of  social  ties. 

Each  of  these  networks  plays  an  important  role  in  society.  A  transportation 
network  provides  a  means  for  goods  and  people  to  move  from  a  starting  location  to  a 
destination.  The  electrical  system  continuously  balances  generation  with  fluctuating  user 
demand.  Communication  networks  and  the  Internet  provide  a  massive  increase  in  the 
amount  of  easily  obtainable  information,  and  they  also  dramatically  decrease  the  amount 
of  time  required  to  transfer  information  around  the  world.  The  study  of  social  networks 
is  increasingly  popular,  with  sites  such  as  Facebook  and  Twitter  capturing  evolving 
relationships  between  millions  of  people.  The  analysis  of  networks  is  even  helping  to 
fight  terrorism  by  identifying  terrorist  networks  so  that  we  can  determine  where  it  is  most 
effective  to  disrupt  them. 

Because  of  the  prevalence  and  importance  of  networks  throughout  our  world,  the 
last  decade  has  seen  increased  scientific  attention  on  the  properties  and  functions  of 
networks.  Much  of  the  recent  effort  has  been  to  catalog  a  diversity  of  networks  and  to 
characterize  their  structural  features.  The  majority  of  this  work  on  network  structure  has 
emphasized  the  connectivity  properties  of  the  underlying  graph,  thus  renewing  interest  in 
graph  theory. 

The study of networks is actually centuries old. Graph theory dates back to
Leonhard Euler in 1736 (Biggs, Lloyd and Wilson, 1998), when he proved there was no
feasible solution to the Königsberg Bridge Problem. The development of random graph
theory in the 1940s and 1950s generated great interest in the characteristics of graphs and
networks (see Newman, Barabási and Watts, 2006, for notable publications). Most
recently, the advent of “network science” during the last decade has brought renewed
interest in the large-scale properties of graphs (National Research Council, 2005).

Despite  considerable  effort  devoted  to  the  “what”  of  these  networks,  considerably 
less  is  known  about  the  “how”  and  the  “why.”  It  is  not  always  clear  what  drives  the 
formation  of  networks.  A  fundamental  question  is  whether  networks  form  to  achieve 
some  overarching  global  objective  or  if  network  structure  is  just  a  byproduct  of  local, 
selfish  decisions.  Or  it  may  be  a  combination  of  the  two.  It  is  also  not  clear  how  global, 
centralized  network  formation  versus  local,  decentralized  formation  affects  the  properties 
of  the  resulting  network. 

Understanding  the  forces  that  drive  network  formation  is  becoming  increasingly 
important.  Of  particular  interest  is  the  Internet.  This  is  an  extremely  complex  network 
that  has  managed  to  evolve  and  grow  at  an  amazing  pace.  To  some  researchers,  the 
Internet  exemplifies  a  system  that  has  self-organized.  They  argue  that  the  network  was 
not  built  by  a  “central”  designer,  but  arose  rather  as  a  result  of  the  localized  actions  of  the 
users  and  service  providers.  In  spite  of  its  ad-hoc  construction,  the  Internet  is  still 
relatively  robust  (Willinger  and  Doyle,  2004).  The  ability  to  model  such  a  complex 
network  and  to  understand  its  underlying  properties  is  extremely  relevant  for  the  study  of 
both  man-made  and  natural  systems. 

Another area of increasing importance is the use of Hastily Formed Networks
(HFNs) in humanitarian aid and disaster relief operations, such as a Hurricane
Katrina scenario (Denning, 2006). These types of networks require rapid coordination
and information sharing among a variety of agencies.

Research in the “how” of network formation has ranged from random graph
generation to system design. Erdős and Rényi (1959) pioneered the exploration of
random graph models, which generated interest in graph and network theory. More
recently, the study of network science has focused attention on “small-world networks”
and “scale-free networks.”

Small-world networks (Watts and Strogatz, 1998) are networks that have high
local clustering and have path lengths between arbitrarily chosen nodes that are still
relatively short. Small-world networks have been used to explain the “six degrees of
separation” phenomenon.

Scale-free networks (Barabási and Albert, 1999) have been used to describe any
number of large, complex real-world networks whose node degree distributions tend to
follow a power law. This power law has been observed in the World-Wide
Web (WWW), the biological sciences, and social networks.

In this thesis, we review some of the recent models used for network formation
and conduct numerical experiments to compare and contrast the structural features of the
graphs they generate. We implement all of the algorithms we discuss, as well as the
numerical experiments, from scratch in the Java programming language. We focus
primarily on problems related to the formation of minimum spanning trees and consider
the cost of selfish behavior, more commonly known as the price of anarchy, in network
formation. We then contrast some of the local, decentralized methods for network
formation with the global, centralized methods. Our results help to clarify the tensions in
network formation problems for both man-made and natural systems.

The remainder of this thesis proceeds as follows. Chapter II reviews previous
research in network formation. Chapter III introduces the minimum spanning tree
problem along with both centralized and decentralized algorithms for solving it. Chapter
IV presents experiments that help to elucidate the underlying mechanism behind network
formation in relation to the minimum spanning tree. We conclude in Chapter V with a
brief summary of our findings and guidance for future research in this area.


II. NETWORK FORMATION MODELS


In  this  chapter,  we  review  several  models  of  network  formation.  Throughout  this 
thesis,  we  adopt  the  following  notation  and  definitions  from  Ahuja,  Magnanti  and  Orlin 
(1993). 

A graph, or network, G = (N, A) consists of a set N of nodes and a set A of arcs. The
number of nodes is n = |N| and the number of arcs is m = |A|. An arc from node i to node
j is denoted as (i, j) where $i, j \in N$. If G is a directed graph then $(i, j) \neq (j, i)$, but if G
is an undirected graph then $(i, j) = (j, i)$.


A subgraph of G = (N, A) is a graph G' = (N', A') with $N' \subseteq N$ and $A' \subseteq A$. It is a
spanning subgraph of G = (N, A) if N' = N. A tree is a connected graph that contains
no cycles. A subtree is a connected subgraph of a tree. A spanning tree of G is a tree
that is a spanning subgraph of G and has exactly n - 1 arcs. A minimum spanning tree
(MST) is a spanning tree of minimum cost.
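
The spanning tree definition can be checked mechanically. The following is a minimal
sketch, assuming nodes are labeled 0 through n - 1 and arcs are given as undirected pairs;
the function name is_spanning_tree is hypothetical. It relies on the fact that a set of
n - 1 arcs that contains no cycle necessarily connects all n nodes.

```python
def is_spanning_tree(n, arcs):
    """Return True if the undirected arcs form a spanning tree on nodes 0..n-1."""
    if len(arcs) != n - 1:
        return False
    parent = list(range(n))  # union-find forest over the nodes

    def find(i):
        # Follow parent pointers to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in arcs:
        ri, rj = find(i), find(j)
        if ri == rj:
            return False  # this arc would close a cycle
        parent[ri] = rj
    return True
```

For example, the path (0, 1), (1, 2), (2, 3) is a spanning tree on four nodes, whereas the
triangle (0, 1), (1, 2), (0, 2) has the right number of arcs but contains a cycle.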


A cut is a partition of node set N into two parts, K and $\bar{K} = N - K$. A cut defines
the set of arcs that have one endpoint in K and the other in $\bar{K}$.

The degree of node i, $\deg_i$, is the total number of arcs incident to node i. The cost
of arc (i, j) is denoted $c_{ij}$. In some of the problems that we consider in this thesis, we
associate each abstract node i with a location $x^i = (x^i_1, x^i_2, \ldots, x^i_d)$ in the d-dimensional
Euclidean space $\mathbb{R}^d$. In such cases, the cost of arc (i, j) is simply the Euclidean distance,

$$c_{ij} = \lVert x^i - x^j \rVert = \sqrt{\sum_{k=1}^{d} \left( x^i_k - x^j_k \right)^2}.$$


A.  CLASSICAL  RANDOM  GRAPH  MODELS 

The modern treatment of networks was forged by Paul Erdős and Alfréd Rényi
(1959), who examined a class of random graphs denoted G(n, p). In this construction,
there are n nodes and each pair of nodes is connected independently with probability p.
As the parameter p varies, the measurable connectedness properties of
random graphs change quite suddenly (see Bollobás, 1985, for an in-depth review). For
small values of p, the graph demonstrates low connectivity with several isolated nodes.
Interestingly though, as p approaches 1/n, a majority of nodes form a cluster and the
graph becomes almost completely connected. For values $p \approx 1$, the graph becomes
highly connected with several cycles. This phenomenon is known as the “emergence of
the giant component.” Figure 1 illustrates this phenomenon.
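
The construction is simple enough to sketch directly. The code below is an illustrative
Python sketch rather than the implementation used in this thesis; the names erdos_renyi
and largest_component are hypothetical, and the size of the largest connected component
is used as one possible way to observe how connectedness changes with p.

```python
import random
from collections import deque


def erdos_renyi(n, p, rng):
    """Adjacency lists of G(n, p): each pair is linked independently with probability p."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def largest_component(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = [False] * len(adj)
    best = 0
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best
```

Calling largest_component(erdos_renyi(50, p, random.Random(seed))) for a range of values
of p around 1/n gives a rough numerical picture of the transition described above; the
exact sizes depend on the seed.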


[Figure 1 panels: a, b, c]


Figure 1. Erdős-Rényi random graphs.

Erdős-Rényi random graphs with varying probabilities for a network with n = 50. A
small p = 1/(n ln n) yields a largely disconnected graph (a). A p = 1/n results
in an almost connected graph (b), and a large p = 1/ln n results in a nearly complete
graph (c).

Another important property is the distribution of the node degrees. The degree of
node i, $\deg_i$, follows a binomial(n - 1, p) distribution. For large values of n, this
distribution can be approximated by a Poisson distribution with mean (n - 1)p (Albert and
Barabási, 2002).


B. RANDOM GEOMETRIC GRAPH MODELS

Another approach to generating random graphs builds on the notion of a
geometric graph. Given a set of nodes N indexed i = 1, 2, ..., n, having locations
$\{x^1, x^2, \ldots, x^n\}$, and a positive parameter r, the geometric graph G(N, r) is the undirected
graph induced by all arcs (i, j) having distance $c_{ij} < r$. When the locations
$\{x^1, x^2, \ldots, x^n\}$ are the result of an independent and identically distributed (IID) random
process, the resulting graph is called a random geometric graph.
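
A minimal Python sketch of this definition follows. It assumes, for illustration only,
that the locations are drawn uniformly from the unit square (the definition above allows
any IID process); the function name random_geometric_graph is hypothetical.

```python
import math
import random


def random_geometric_graph(n, r, rng):
    """Place n IID uniform points in the unit square; link pairs closer than r."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    arcs = []
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) < r:  # arc (i, j) exists when c_ij < r
                arcs.append((i, j))
    return points, arcs
```

Under the unit-square assumption, no two points are farther apart than the diagonal
length, so any r above that value produces the complete graph, while r = 0 produces no
arcs at all.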

Most of the theoretical results for random geometric graphs are cumbersome,
especially in higher dimensions (see Penrose, 2003, for an in-depth treatment). Unlike
classical Erdős-Rényi graphs in which the presence of arcs is independent, the role of
proximity in random geometric graphs makes the appearance of (nearby) arcs
dependent. However, these graphs share remarkably similar behavior in the emergence
of the giant component (see Goel, Rai, and Krishnamachari, 2003, and references
therein).

Random  geometric  graphs  are  often  used  in  classification  problems  in  statistics. 
For  example,  suppose  that  individuals  have  d  characteristics  and  each  can  be  represented 
by  a  continuous  variable  (which  may  not  be  true  in  practice).  Using  an  appropriately 
defined measure of distance in this d-dimensional space, one can classify two individuals
as  being  “similar”  if  their  distance  is  less  than  some  constant  parameter  r.  This  approach 
makes  it  possible  to  identify  clusters  of  similar  individuals,  which  can  be  useful  in  many 
practical  applications. 

Figure 2 illustrates the features of a random geometric graph.

[Figure 2 panels: a, b, c]


Figure 2. Random geometric graphs.

Random geometric graphs of n = 50 nodes resulting from varying values of the
parameter r. A sparse graph (a) results when r = 0.1. The giant component emerges (b)
when r = 0.2. A highly connected graph (c) results for r = 0.5.

C.  PREFERENTIAL  ATTACHMENT  MODELS 

Unlike the graphs produced in the Erdős-Rényi model, the degree distribution of
many real-world networks does not follow a Poisson distribution. Albert and Barabási
(2002) observe that many networks have a skewed distribution, in which the majority of
nodes have small node degrees while very few nodes have high degrees. The
connectivity of these networks can be characterized by a power-law distribution, in which
the probability that a node has degree k is $P(k) \approx k^{-\gamma}$, where typically
$2 < \gamma < 3$.

Research  in  a  multitude  of  disciplines  has  demonstrated  power  law  distributions 
within  networks.  Price  (1965)  demonstrates  that  the  network  of  bibliographic  citations  in 
scientific  publications  produces  a  heavy  tailed  distribution.  West  (1999)  argues  that 
several  characteristics  of  biological  systems,  such  as  metabolic  rate,  sizes  and  time  scales 
can  be  modeled  with  a  power-law  for  several  species.  Faloutsos,  Faloutsos  and  Faloutsos 
(1999)  argue  that  power-laws  could  be  used  to  predict  characteristics  of  the  Internet 
topology.  In  finance  and  economics,  Gabaix  (2009)  provides  a  good  summary  of  power 
law  distributions  exhibited  in  a  variety  of  areas  such  as  firm  size,  city  size,  and  the 
distribution of income and wealth.

Since the random graph model does not produce the power-law distribution of
node degrees observed in real-world networks, Barabási and Albert (1999) present a
different model based on preferential attachment. This model differs from the random
graph model by accounting for network growth, and it assumes that newly added nodes
are more likely to attach to nodes with high connectivity. The probability p that the new
node will attach to node i depends on the connectivity of node i, where connectivity is the
ratio of node i's degree to the sum of the degrees of all existing nodes, such that
$p = \deg_i / \sum_j \deg_j$. This model of network formation produces a scale-free network, a graph
whose resulting node degree distribution follows a power law. For some systems, the
scale-free network produced by the model is more similar in its connectivity to the real
network than a graph generated from the random graph model. Figure 3 shows the formation
of a 1000-node network via preferential attachment, while Figure 4 displays the resulting
power law of the node degree distribution.

[Figure 3 panels: a, b, c, d]

Figure 3. Graph formed from preferential attachment model.

Growth of a graph with n = 1000 nodes using the preferential attachment model described
by Barabási and Albert (1999). Nodes initially are added to the network (a). Initial hubs
begin to form (b). Larger hubs are easily identifiable for networks with 100 nodes (c)
and 1,000 nodes (d).

[Figure 4: node degree distribution]

Nodes are ranked based on their node degree from 1 to 1000, with the node with the
highest degree ranked as 1, then plotted versus their degree. Because of the large number
of nodes with small node degrees, there is a lack of resolution in this region of the plot.
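
The attachment rule of Section C can be sketched compactly. The Python code below is an
illustration under stated assumptions, not the implementation behind Figures 3 and 4: it
assumes that the network starts from two nodes joined by one arc and that each arriving
node forms exactly one arc. The names preferential_attachment and degrees are
hypothetical.

```python
import random


def preferential_attachment(n, rng):
    """Grow a tree on n >= 2 nodes; each new node links to one existing node
    chosen with probability proportional to that node's current degree."""
    arcs = [(0, 1)]
    # Each node appears in `endpoints` once per incident arc, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [0, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        arcs.append((target, new))
        endpoints.extend((target, new))
    return arcs


def degrees(n, arcs):
    """Degree of every node 0..n-1 in the undirected graph given by arcs."""
    deg = [0] * n
    for i, j in arcs:
        deg[i] += 1
        deg[j] += 1
    return deg
```

Sorting the output of degrees in decreasing order gives the kind of rank-versus-degree
data described in the caption above.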

D.  OPTIMIZATION-BASED  MODELS 

Doyle  and  Carlson  (1999)  propose  a  different  mechanism,  called  highly  optimized 
tolerance  (HOT)  that  produces  power-law  distributions.  They  suggest  that  complex 
networks  are  optimized  for  robust  performance  and  that  the  observed  power  law 
distributions  are  a  result  of  the  tradeoffs  that  must  be  made  due  to  system  constraints. 
Key  features  of  their  HOT  model  include  “(1)  high  efficiency,  performance  and 
robustness  to  designed-for  uncertainties;  (2)  hypersensitivity  to  design  flaws  and 
unanticipated  perturbations;  (3)  nongeneric,  specialized,  structured  configurations;  and 
(4)  power  laws”  (Doyle  and  Carlson,  1999,  p.  1413). 

Fabrikant, Koutsoupias and Papadimitriou (2002) suggest a simple model, which
we will refer to as the FKP model, for network formation that is based on the “tradeoff”
concept present in the HOT model. Like the Barabási-Albert model, they grow a network
one node at a time, but they also give each node a location in the unit square. When
deciding to which node in the network the new node should attach, they propose two
logical considerations. First, they assume the node would want to minimize its
connection “cost” (represented by the Euclidean distance between itself and the node it
attaches to). Second, they assume the node would prefer to connect to a node that is more
centrally located. These objectives can be weighted in order to alter the relative
importance between the two. Specifically, in their model, node i attaches to the node j
that solves:

$$\min_{j:\, j < i} \; \alpha \cdot c_{ij} + h_j \qquad (1)$$

where $c_{ij}$ is the Euclidean distance between nodes i and j and $h_j$ is a measure of centrality
for node j. The weighting factor, $\alpha > 0$, is usually defined as a function of the final
number of nodes n. The centrality h can be defined in several ways, such as the average
number of hops to all other nodes, the average Euclidean distance to all other nodes or the
distance to some predefined central node (Fabrikant et al., 2002).
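
To make objective (1) concrete, the following Python sketch grows a tree using one of
the centrality choices listed above. It assumes that the first node to arrive (index 0)
plays the role of the predefined central node and that centrality is the hop count to
that node; the function name fkp_tree is hypothetical, and this is not the Java code
used elsewhere in this thesis.

```python
import math
import random


def fkp_tree(points, alpha):
    """Return parent[i] for each node, with node 0 as the root.

    Node i (arriving in index order) attaches to the earlier node j that
    minimizes alpha * c_ij + h_j, where h_j is j's hop count to node 0.
    """
    n = len(points)
    parent = [None] * n
    hops = [0] * n
    for i in range(1, n):
        def score(j):
            c_ij = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            return alpha * c_ij + hops[j]

        j_star = min(range(i), key=score)  # ties go to the earliest node
        parent[i] = j_star
        hops[i] = hops[j_star] + 1
    return parent
```

In this sketch, setting the weight to zero removes distance from the objective, so every
node attaches directly to node 0. A very large weight makes distance dominate, so each
node attaches to its nearest predecessor.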

Fabrikant et al. (2002) show that by varying the value of $\alpha$, graphs with very
different properties result. They prove that when centrality is measured as the number of
hops to a defined node, $n_0$, then for $\alpha < 1/\sqrt{2}$, distance is relatively insignificant
compared to centrality, and the resultant network is a star with the center at $n_0$. As $\alpha$
approaches $\sqrt{n}$, there is a closer trade-off between distance and centrality, and the node
degree distribution can be represented by a power law. Once $\alpha$ exceeds $\sqrt{n}$, distance
becomes the overriding factor, and a form of a Euclidean spanning tree results. Figure 5
displays 1000-node networks formed with $\alpha$ = 0.75, 5 and 32 ($\approx \sqrt{1000}$). Figure 6
demonstrates the power law that results when $\alpha$ is approximately $\sqrt{n}$.

[Figure 5 panels: a, b, c]


Figure 5. Some realizations of FKP networks with n = 1000.

The star results (a) with a small $\alpha = 0.75$. Large clusters are evident (b) when $\alpha = 5$. A
network whose node degree distribution can be represented by a power law (c) results
when $\alpha = 32$ ($\approx \sqrt{n}$).


Figure 6. Degree distribution of an FKP network.


Network has n = 1000 nodes and $\alpha = 32$ ($\approx \sqrt{n}$). Nodes are ranked based on their node
degree from 1 to 1000, with the node with the highest degree ranked as 1, then plotted
versus their degree. Because of the large number of nodes with small node degrees, there
is a lack of resolution in this region of the plot.

The FKP model introduces a novel idea for network formation. Unlike the Erdős-Rényi
graphs that are entirely based on a random selection of arcs, this model suggests a
highly organized, locally optimized model. The model also demonstrates a power law in
the node degree distribution as seen in the Barabási-Albert model, although Berger,
Bollobás, Borgs, Chayes, and Riordan (2003) argue that the resulting distribution is not a
strict power law, but has an exponential cutoff.

E.  GAME  THEORETIC  MODELS 

Another approach to network formation uses game theory. Fabrikant, Luthra,
Maneva,  and  Papadimitriou  (2003)  propose  a  network  formation  game  to  explore  how  an 
undirected  network  created  from  selfish-acting  nodes  would  affect  the  network 
performance  as  a  whole. 

The game is as follows: There are n players, each representing a node in the
network. The entire set of players is N, with $|N| = n$. Each player $i \in \{1,2,\ldots,n\}$
chooses a strategy set $s_i = \{s_{i1}, s_{i2}, \ldots, s_{ij}, \ldots, s_{in}\}$, which defines the
network edges to build from i to other nodes $j \in \{1,2,\ldots,n\}$. The set
$s = \{s_1, s_2, \ldots, s_n\}$ denotes the collective strategy of all players.

Let A(s) be the set of arcs resulting from strategy s. Therefore,
$A(s) = \{(i,j) : i \neq j,\ s_{ij} = 1 \text{ or } s_{ji} = 1\}$ and $G(s) = (N, A(s))$ is the
undirected graph that results from strategy s. Once a strategy is chosen, each player
$i \in \{1,2,\ldots,n\}$ incurs a cost

$$c_i(s) = \alpha \cdot |s_i| + \sum_{j \in N} d_{(i,j)}(G(s))$$

where $\alpha$ is the fixed cost of forming a single connection between two players, and
$d_{(i,j)}(G(s))$ is the distance, measured in hop count, between nodes i and j in the
resulting graph G(s). If no path exists between i and j, then $d_{(i,j)}(G(s)) = \infty$.
This is called the Unilateral Connection Game (UCG) because each node is able to use an
arc, regardless of who paid for it.
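
The UCG cost can be evaluated directly for small examples. The following is a minimal sketch, not code from Fabrikant et al. (2003): it assumes nodes are numbered from 0, that each strategy is stored as the set of nodes a player buys arcs to, and the function names (`ucg_arcs`, `hop_distances`, `ucg_cost`) are hypothetical.

```python
from collections import deque
from math import inf

def ucg_arcs(strategies):
    """Arc set A(s): arc (i, j) exists if i bought it or j bought it."""
    arcs = set()
    for i, bought in enumerate(strategies):
        for j in bought:
            if i != j:
                arcs.add((min(i, j), max(i, j)))
    return arcs

def hop_distances(n, arcs, source):
    """Breadth-first search; unreachable nodes keep distance infinity."""
    adj = [[] for _ in range(n)]
    for u, v in arcs:
        adj[u].append(v)
        adj[v].append(u)
    dist = [inf] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == inf:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def ucg_cost(i, strategies, alpha):
    """c_i(s) = alpha * |s_i| + sum over j of the hop distance from i to j."""
    n = len(strategies)
    arcs = ucg_arcs(strategies)
    return alpha * len(strategies[i]) + sum(hop_distances(n, arcs, i))

# Star on four nodes: node 0 buys an arc to every other node.
star = [{1, 2, 3}, set(), set(), set()]
center_cost = ucg_cost(0, star, alpha=2.0)
leaf_cost = ucg_cost(1, star, alpha=2.0)
```

In the star, the center pays for all arcs while the leaves pay only in distance, which illustrates why a node may use an arc regardless of who paid for it.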

An extension of this game is the Bilateral Connection Game (BCG) described by
Corbo and Parkes (2005). The major difference from the UCG is that in the BCG an arc
only forms if both nodes’ strategies contain that arc. So in this game
$A(s) = \{(i,j) : i \neq j,\ s_{ij} = 1 \text{ and } s_{ji} = 1\}$ and any connection cost is
shared equally between the two nodes.
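
The difference between the two arc sets is only the "or" versus "and" in the definition of A(s). A minimal sketch, assuming the collective strategy is stored as a 0/1 matrix with `s[i][j] == 1` when player i includes arc (i,j); the function name `arc_set` is hypothetical:

```python
def arc_set(s, bilateral=False):
    """Return A(s) for a 0/1 strategy matrix s over nodes 0..n-1."""
    n = len(s)
    arcs = set()
    for i in range(n):
        for j in range(i + 1, n):
            if bilateral:
                present = s[i][j] == 1 and s[j][i] == 1   # BCG: both nodes must consent
            else:
                present = s[i][j] == 1 or s[j][i] == 1    # UCG: either node suffices
            if present:
                arcs.add((i, j))
    return arcs

# Node 0 wants arcs to 1 and 2; node 1 reciprocates; node 2 does not.
s = [[0, 1, 1],
     [1, 0, 0],
     [0, 0, 0]]
ucg = arc_set(s)
bcg = arc_set(s, bilateral=True)
```

With the same strategies, the unilateral game forms both arcs while the bilateral game forms only the arc that both endpoints requested.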

In both cases, the social cost of the network is compared to that of a Nash
equilibrium. The Nash equilibrium is a strategy s that satisfies

$$c_i(s) = c_i(s_i, s_{-i}) \le c_i(s_i', s_{-i}) \quad \forall i \in N,\ s_i' \in S_i .$$

In other words, at the Nash equilibrium, no node has incentive to change its strategy. The
social cost of the network is defined as:

$$C(G(s)) = \sum_{i \in N} c_i(s) = \begin{cases} \alpha \cdot |A(s)| + \sum_{i,j \in N} d_{(i,j)}(G(s)) & \text{for the UCG} \\ 2\alpha \cdot |A(s)| + \sum_{i,j \in N} d_{(i,j)}(G(s)) & \text{for the BCG.} \end{cases}$$

The term price of anarchy is the ratio of the social costs of the worst case Nash
equilibrium and the social optimum (Koutsoupias and Papadimitriou, 1999;
Roughgarden, 2005) and is used to measure the lack of coordination when the nodes act
selfishly.


Fabrikant et al. (2003) show that the results of the UCG vary based on the value
of the parameter $\alpha$. When $\alpha < 1$, the social optimum is a complete graph, and this
is the only Nash equilibrium. When $1 < \alpha < 2$, the complete graph is still the social
optimum, but it is no longer a Nash equilibrium. The worst Nash equilibrium is the star,
leading to a price of anarchy of

$$\frac{C(\text{star})}{C(\text{complete graph})} \le \frac{4}{3} .$$

When $\alpha > 2$, the social optimum is a star, although there can be worse Nash equilibria.
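
As a worked check of the star-to-complete-graph ratio, the two social costs can be written out for n nodes. This is a sketch under two stated assumptions: the distance sum in the UCG social cost runs over ordered pairs (i,j), and each arc is paid for exactly once.

```latex
\begin{align*}
C(\text{complete graph}) &= \alpha\,\frac{n(n-1)}{2} + n(n-1), \\
C(\text{star}) &= \alpha\,(n-1) + 2(n-1) + 2(n-1)(n-2) = \alpha\,(n-1) + 2(n-1)^2, \\
\frac{C(\text{star})}{C(\text{complete graph})} &= \frac{2\alpha + 4(n-1)}{n(\alpha+2)} .
\end{align*}
```

At $\alpha = 1$ the ratio equals $\frac{4}{3} - \frac{2}{3n}$, which approaches 4/3 from below as n grows. For $n \ge 3$ the ratio decreases in $\alpha$ and equals 1 at $\alpha = 2$, which is consistent with the bound of 4/3 shown by Fabrikant et al. (2003) for this range.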

Corbo and Parkes (2005) build upon the equilibrium concept to define a pairwise
Nash equilibrium as a strategy s that supports G(s) as a Nash equilibrium and for all
$(i,j) \notin A(s)$, if $c_i(s + \Delta_{(i,j)}) < c_i(s)$ then $c_j(s + \Delta_{(i,j)}) > c_j(s)$, where
$\Delta_{(i,j)}$ represents a strategy consisting of only the arc (i,j). They also define a
network as being pairwise stable if for all $(i,j) \in A(s)$, $c_i(s - \Delta_{(i,j)}) \ge c_i(s)$, and
for all $(i,j) \notin A(s)$, if $c_i(s + \Delta_{(i,j)}) < c_i(s)$ then $c_j(s + \Delta_{(i,j)}) > c_j(s)$.
Similarly to the UCG, they prove that for $\alpha < 1$, the complete graph is the only
efficient and pairwise stable graph. For $\alpha > 1$, the star network is the only efficient
graph; it is pairwise stable, but it is not the only pairwise stable graph. They also
propose that the price of anarchy for the BCG is worse than that of the UCG (Corbo and
Parkes, 2007).


Fabrikant et al. and Corbo and Parkes focus on different properties of the
networks formed from the UCG and BCG than the previously reviewed models. Similar
to the HOT model, the connection cost is associated with Euclidean distance and the
number of arcs in the network, and by tuning the weighting factor $\alpha$, networks ranging
from the complete graph to a star can be produced. However, unlike the previous
models, they concentrate on quantifying the cost associated with selfish behavior to
compare it to the social optimum. Also, the BCG introduces a unique feature of
restricting the arcs in the network to those formed through the consent of the two nodes.
Both models provide an interesting way of looking at network formation.


F.  DISCUSSION 

The  key  insight  of  Fabrikant  et  al.  (2002)  is  that  the  power  laws  observed  in  the 
structure  of  many  man-made  and  natural  systems  can  result  from  design  tradeoffs  that 
can  be  captured  in  simple  optimization  models.  Their  model  was  inspired  by  tensions 
perceived  in  the  Internet — a  desire  to  minimize  the  cost  of  connecting  while  also  wanting 
to  have  low  delay  (i.e.,  be  “central”)  when  communicating.  But  their  model  reflects  a 
local,  myopic  decision  process.  It  is  unclear  how,  if  at  all,  this  local  process  relates  to  the 
global  behavior  of  the  network. 

The  price  of  anarchy  in  the  study  of  network  formation  games  addresses  explicitly 
the  difference  between  the  social  optimum  for  some  system  (as  achieved,  for  example,  by 
a  central  decision  maker)  with  the  aggregate  outcome  of  local  agents.  Fabrikant  et  al. 
(2002)  focus  on  the  global  connectivity  properties  (e.g.,  degree  distributions)  of  the 
graph, but is there an interpretation for a system-wide objective?

In the case of large $\alpha$ the objective (1) emphasizes only the local connection cost,
and it is possible to interpret the collective behavior as trying to minimize the distance of
the resulting tree, albeit in a heuristic manner. With this in mind, we now consider the
classic minimum spanning tree problem.

III. THE MINIMUM SPANNING TREE PROBLEM


A.  INTRODUCTION  TO  MINIMUM  SPANNING  TREE 

A classic problem in the study of networks is the minimum spanning tree (MST)
problem. The first algorithm for solving it was published by Otakar Borůvka in
1926 (see Graham and Hell, 1985, for a history of the MST). Minimum spanning trees have
several practical applications, such as determining the minimum amount of wire needed to
connect several electrical components or calculating the minimum amount of piping
required to connect houses in a neighborhood to a water system.

The MST problem for a network G(N,A) can be treated as an optimization
problem and easily be formulated as an integer linear program. The formulation follows,
where $H \subseteq N$ and the arc set for H, $A(H) \subseteq A$.

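1. Indices

$i \in N$    node (i = 1,2,...,n) (alias j)

$(i,j) \in A$    undirected arc between node i and node j

2. Data

$c_{ij}$    cost of arc (i,j)

3. Decision Variable

$Z_{ij}$    binary variable indicating if arc (i,j) is in tree

4. Formulation

$$\min \sum_{(i,j) \in A} c_{ij} Z_{ij} \tag{3.1}$$

$$\text{s.t.} \quad \sum_{(i,j) \in A} Z_{ij} = n - 1 \tag{3.2}$$

$$\sum_{(i,j) \in A(H)} Z_{ij} \le |H| - 1 \quad \forall \text{ sets } H \subset N \tag{3.3}$$

$$Z_{ij} \in \{0,1\} \tag{3.4}$$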

The objective function (3.1) minimizes the costs of the arcs chosen. Equation 3.2
is a cardinality constraint that ensures that exactly n-1 arcs are selected, while Equation 3.3
ensures that there are no resulting cycles. Although this problem is simple to formulate,
solving it with linear programming is nontrivial for large n. The number of sets
$H \subset N$ grows exponentially with n, so the total number of constraints arising from
Equation 3.3 becomes exponential, making the problem increasingly difficult to solve as
n grows.
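
To make the growth in constraints concrete, the cardinality and cycle-elimination constraints can be checked by brute force for very small n. This is only an illustrative sketch: it assumes nodes numbered from 0, enumerates every node subset H with at least two nodes, and uses hypothetical function names.

```python
from itertools import combinations

def satisfies_mst_constraints(n, chosen):
    """Check constraints (3.2) and (3.3) for a set of chosen arcs on nodes 0..n-1."""
    if len(chosen) != n - 1:                      # (3.2): exactly n - 1 arcs
        return False
    for size in range(2, n + 1):                  # (3.3): one constraint per subset H
        for H in combinations(range(n), size):
            members = set(H)
            inside = sum(1 for (i, j) in chosen if i in members and j in members)
            if inside > size - 1:                 # too many arcs inside H means a cycle
                return False
    return True

def count_subset_constraints(n):
    """Number of node subsets with at least two nodes: 2^n - n - 1."""
    return 2 ** n - n - 1

path = {(0, 1), (1, 2), (2, 3)}                   # a spanning tree on four nodes
triangle = {(0, 1), (1, 2), (0, 2)}               # three arcs, but a cycle and node 3 isolated
```

Even for the 10-node examples used later in this chapter, the enumeration already covers more than a thousand subsets, which is why the specialized algorithms below are preferred.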

However, the MST has a special tree structure that results in two necessary and
sufficient conditions to prove that a tree is a MST (see Ahuja et al., 1993, pp. 518-519,
for a detailed discussion). The first is the cut optimality condition, which states:

A spanning tree T is a MST if and only if for every tree arc $(i,j) \in T$, $c_{ij} \le c_{kl}$ for
every arc (k,l) contained in the cut formed by deleting arc (i,j) from T.

The second is the path optimality condition, which states:

A spanning tree T is a MST if and only if for every nontree arc $(k,l) \in A$ of G,
$c_{ij} \le c_{kl}$ for every arc (i,j) contained in the path in T connecting nodes k and l.
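
The path optimality condition translates directly into a verification routine. The sketch below is an illustration rather than an algorithm from Ahuja et al. (1993); it assumes nodes numbered from 0, a cost dictionary keyed by arcs (i, j) with i < j, and that `tree_arcs` really is a spanning tree. The function names are hypothetical.

```python
def tree_path(tree_adj, start, goal):
    """Return the arcs on the unique path from start to goal in a tree."""
    stack = [(start, None, [])]
    while stack:
        node, parent, path = stack.pop()
        if node == goal:
            return path
        for nxt in tree_adj[node]:
            if nxt != parent:
                stack.append((nxt, node, path + [(node, nxt)]))
    return None

def satisfies_path_optimality(n, cost, tree_arcs):
    """True if no nontree arc is cheaper than some tree arc on the path it would close."""
    key = lambda i, j: (min(i, j), max(i, j))
    tree = {key(i, j) for i, j in tree_arcs}
    tree_adj = {v: [] for v in range(n)}
    for i, j in tree:
        tree_adj[i].append(j)
        tree_adj[j].append(i)
    for (k, l), c_kl in cost.items():
        if (k, l) in tree:
            continue
        # Every tree arc on the k-l path must cost no more than the nontree arc (k, l).
        for i, j in tree_path(tree_adj, k, l):
            if cost[key(i, j)] > c_kl:
                return False
    return True

cost = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}
```

On this triangle, the tree made of the two cheapest arcs passes the check, while a tree containing the most expensive arc fails it.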

Using these two principles for optimality, one can obtain simpler algorithms that solve
the MST. Kruskal’s algorithm and Prim’s algorithm are two popular methods that use
global information of the network to solve for the MST. A novel, decentralized
algorithm that utilizes information only known locally to individual nodes has been
proposed by Gallager, Humblet and Spira (1983). The following sections of this chapter
discuss these algorithms.

B. KRUSKAL’S ALGORITHM

Kruskal’s  algorithm  (1956)  directly  utilizes  the  path  optimality  condition  to  build 
the  MST  one  arc  at  a  time.  The  algorithm  does  this  by  maintaining  two  lists  of  arcs.  The 
algorithm  initializes  by  sorting  all  the  arcs  in  increasing  order  of  cost  and  placing  them  in 
a  list,  called  SORTED.  A  second  list,  called  FINAL,  is  initially  empty.  The  algorithm 
proceeds  by  examining  each  arc  in  SORTED  and  either  adding  it  to  the  FINAL  list  or 
discarding  it.  The  algorithm  begins  by  adding  the  first  arc  in  SORTED  (having  the 
smallest  cost)  to  FINAL.  Then  for  each  subsequent  arc,  the  algorithm  examines  whether 
adding  it  to  FINAL  would  create  a  cycle.  If  adding  the  arc  would  create  a  cycle,  it  is 
discarded.  Otherwise,  it  is  added  to  FINAL.  This  process  continues  until  there  are  n-1 
arcs in FINAL. This algorithm, as presented, requires O(m log n) time to sort the arcs
(where m denotes the number of arcs) and O(mn) time to detect a cycle, although Ahuja
et al. (1993) provide a more efficient algorithm that operates in O(m + n log n) time.
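
The SORTED and FINAL lists described above can be sketched in a few lines. Note one substitution: instead of searching FINAL for a cycle, this sketch detects cycles with a union-find (disjoint-set) structure, a standard technique that is not part of the description above. Nodes are assumed to be numbered from 0 and arcs are given as (cost, i, j) tuples.

```python
def kruskal(n, arcs):
    """arcs: iterable of (cost, i, j) on nodes 0..n-1. Returns (FINAL list, total cost)."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the component representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    sorted_arcs = sorted(arcs)          # the SORTED list, in increasing order of cost
    final = []                          # the FINAL list, initially empty
    for cost, i, j in sorted_arcs:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue                    # both ends already connected: the arc would close a cycle
        parent[ri] = rj                 # merge the two components
        final.append((cost, i, j))
        if len(final) == n - 1:
            break                       # a spanning tree has exactly n - 1 arcs
    return final, sum(c for c, _, _ in final)

example_arcs = [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)]
final, total = kruskal(4, example_arcs)
```

In the example, the arc of cost 3 is discarded because its endpoints are already joined through the two cheaper arcs.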

Figure 7 illustrates Kruskal’s algorithm on an undirected graph having n = 10
nodes, each positioned in the unit square. The cost of the arc between nodes i and j is the
Euclidean distance between the nodes. We assume no restrictions in this example,
so each node can connect to any other node. Figure 7a shows the nodes in the initial
empty graph. Figure 7b illustrates the two smallest arcs, between nodes n1 and n9 and
between nodes n3 and n5. The first arc in SORTED that would create a cycle is the (n1,n3)
arc, as shown in Figure 7c. This arc violates the path optimality condition, so it
is discarded. The algorithm examines each arc in turn and adds it if doing so does not
create a cycle. Finally, when the number of arcs equals n-1 (9 in this example), the
algorithm terminates and the MST results (Figure 7d).

Figure 7. Kruskal’s algorithm

The initial network has no arcs (a). The smallest arcs are added first (b). The (n1,n3) arc
would create a cycle (c) so it is discarded. The MST results (d).

C.  PRIM’S  ALGORITHM 

Prim’s algorithm (1957) is based on the cut optimality condition. It initiates with
a cut in which an arbitrary start node of the network is in subset $K$, while the remainder
of the nodes are in subset $\bar{K}$. The minimum-weight arc from the start node is then added
to the list of MST arcs, and the head node of that arc is removed from $\bar{K}$ and placed in $K$,
creating a new cut. The minimum-weight arc from any node in $K$ to a node in $\bar{K}$ is then
added to the MST, with its head node moving from $\bar{K}$ to $K$. This method
continues to create cuts between the two subsets until all nodes have been placed into $K$,
at which point the MST list contains n-1 arcs. Prim’s algorithm requires O(mn) time
because of the time required to search for the minimum arc in the cut. Ahuja et al. (1993)
also present a more efficient data structure that can reduce the time to O(m + n log n).
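
The repeated search for the minimum arc in the cut can be sketched with a binary heap. This is an illustrative variant, not the specific data structure discussed by Ahuja et al. (1993); it assumes a complete graph given as a symmetric cost matrix with nodes numbered from 0.

```python
import heapq

def prim(n, cost, start=0):
    """cost[i][j]: symmetric cost matrix. Returns (MST arcs, total cost)."""
    in_K = [False] * n
    in_K[start] = True
    # Heap of arcs (cost, tail in K, head in K-bar) that cross the current cut.
    heap = [(cost[start][j], start, j) for j in range(n) if j != start]
    heapq.heapify(heap)
    mst = []
    while heap and len(mst) < n - 1:
        c, i, j = heapq.heappop(heap)
        if in_K[j]:
            continue                    # stale entry: the head has already moved into K
        in_K[j] = True                  # move the head node from K-bar to K
        mst.append((i, j))
        for k in range(n):
            if not in_K[k]:
                heapq.heappush(heap, (cost[j][k], j, k))   # new arcs crossing the cut
    return mst, sum(cost[i][j] for i, j in mst)

cost_matrix = [[0, 1, 3, 9],
               [1, 0, 2, 9],
               [3, 2, 0, 4],
               [9, 9, 4, 0]]
mst_arcs, mst_total = prim(4, cost_matrix)
```

Starting from a different node changes the order in which arcs are added, but not the total cost of the resulting tree.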

Figure 8 illustrates Prim’s algorithm on an undirected graph having n = 10 nodes,
each with the same coordinates in the unit square as in the previous example. Any node
can be chosen to initiate the algorithm; in this example, node n1 initiates it.
Figure 8a shows that n1 is in the set $K$ whereas the remainder of
the nodes are in $\bar{K}$. The dashed, curved line indicates the cut in the graph, and the three
smallest cost arcs in the cut are illustrated, although all arcs from n1 are technically in the
cut. The (n1,n5) arc has the minimum cost in the cut, so it is added to the MST and the
n5 node moves from $\bar{K}$ to $K$ as demonstrated in Figure 8b. Figure 8b also indicates the
four least cost arcs in the new cut; arc (n5,n3) is the minimum, so it is added (Figure
8c). This process continues to use the cut optimality condition (Figures 8c-e) until all
nodes are in the same set and the MST results (Figure 8f).

Figure 8. Prim’s algorithm

The algorithm begins with one node in subset $K$ and the rest in $\bar{K}$ (a). The minimum cost
arc in the cut (denoted by the dashed line) is added to the MST (b) and arcs in the new cut
are compared. It proceeds by continually adding the minimum arc in the new cuts (c-e)
until the MST is produced (f).

D. DECENTRALIZED ALGORITHM


More recently, focus has turned to decentralized methods for solving the MST
problem. Gallager, Humblet and Spira (1983) present a distributed algorithm that also
uses the path optimality condition to solve this problem in an undirected network. Their
algorithm relies on a node’s localized information and its ability to receive, process and
send “messages.” The complexity of this algorithm is therefore measured by the number
of messages that are passed, which is at most $5n \log_2 n + 2m$.

Each  node  maintains  an  individual  queue  to  store  its  incoming  messages,  and  it 
processes  them  in  first-in  first-out  order.  The  algorithm  works  by  combining  separate 
graph  fragments  together  into  a  final  MST.  Initially  each  node  is  its  own  fragment,  and 
then  through  the  message  passing  process,  it  combines  with  other  nodes  to  create  new 
fragments,  finally  combining  into  a  final  fragment  containing  the  MST.  By  passing 
messages,  each  node  eventually  discovers  which  of  its  arcs  are  in  the  MST. 

Throughout this algorithm, the actions that a node initiates upon receiving a
message depend on its state, and the arc upon which it sends its message depends on the
state and weight of its arcs. A node has one of three states. The Sleep state is the initial
state for all nodes, the Find state is when the node is trying to find its fragment’s
minimum-weight edge, and the Found state is for all other instances.

Other information that a node maintains is its fragment identification, which is the
value of the minimum cost arc joining two fragments, and its current level. The level is
used to control which fragments can connect. A fragment with a lower level can be
absorbed by a higher level fragment (but not vice versa), and two fragments of the same
level can join. This imposes an ordering on the way that fragments merge.

The  algorithm  proceeds  with  nodes  passing  messages  to  one  another  to  create 
fragments  along  their  minimum-weight  edges.  Each  node  maintains  its  own  fragment 
identification,  and  can  send  this  information  in  a  message.  That  way,  when  a  node 
receives  a  message,  it  knows  if  it  came  from  its  own  fragment  or  a  new  fragment.  Once 
fragments  have  been  formed,  the  nodes  continue  to  pass  messages  along  their  arcs  to  (1) 
determine the minimum-weight edge of the fragment and (2) see if the receiving node is
part  of  a  new  fragment.  In  this  manner,  the  fragments  continue  to  merge  until  they  have 
obtained  the  MST. 
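
The fragment-merging idea can be illustrated with a deliberately simplified sketch. This is not the algorithm of Gallager et al. (1983): it is centralized and synchronous, it has no messages, levels, or node states, and it follows the classical Borůvka scheme in which every fragment selects its minimum-weight outgoing arc in each round. Arcs are assumed to be distinct (weight, i, j) tuples on nodes numbered from 0, with ties broken by tuple order.

```python
def merge_fragments(n, arcs):
    """arcs: list of (weight, i, j). Returns the sorted MST arcs found by fragment merging."""
    fragment = list(range(n))                 # initially each node is its own fragment
    branch = set()                            # arcs accepted into the MST
    while len(set(fragment)) > 1:
        # Each fragment picks its minimum-weight outgoing arc.
        best = {}
        for arc in arcs:
            w, i, j = arc
            fi, fj = fragment[i], fragment[j]
            if fi == fj:
                continue                      # both ends in one fragment: never an MST candidate
            for f in (fi, fj):
                if f not in best or arc < best[f]:
                    best[f] = arc
        if not best:
            raise ValueError("graph is not connected")
        # Merge along the chosen arcs, relabelling the absorbed fragment.
        for w, i, j in set(best.values()):
            fi, fj = fragment[i], fragment[j]
            if fi != fj:
                branch.add((w, i, j))
                fragment = [fj if f == fi else f for f in fragment]
    return sorted(branch)

example = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
tree = merge_fragments(4, example)
```

The sketch shows why merging along minimum-weight outgoing arcs is safe; the contribution of the distributed algorithm is achieving the same result when each node sees only its own arcs.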

To  keep  track  of  which  arcs  are  involved  in  the  MST,  nodes  maintain  the  states  of 
their  arcs.  Each  arc  also  has  one  of  three  states.  Initially,  all  arcs  are  in  the  Basic  state, 
indicating they have potential to be a part of the MST. If an arc is in the Branch state, it
has been identified as part of the MST. Lastly, if it is in the Rejected state, then the
node  has  determined  that  it  is  not  in  the  MST,  but  rather  an  arc  connecting  two  nodes  in 
the  same  fragment. 

Figure  9  provides  an  overview  (see  Gallager  et  al.  1983,  for  detailed  pseudo-code) 
of  how  the  decentralized  algorithm  results  in  the  MST  for  the  same  undirected  10  node 
network  used  above.  Again,  it  is  assumed  that  each  node  can  attach  to  every  other  node 
in the network, with the weight of the arc between nodes i and j equal to the Euclidean
distance between those nodes. Although this is an undirected network, each node
maintains the state of its arcs, so directional arcs will be used to illustrate the arc state
maintained by the tail node. It is possible for node i and node j to have different states for
arc (i,j), but the discrepancy is resolved via the message passing and does not interfere
with the algorithm.

Figure 9a shows the initialization of the algorithm, in which nodes n1 and n10
undergo a “wakeup” procedure that initializes their level to 0 and changes their state from
Sleep to Found. Like Kruskal’s algorithm, these nodes each identify their minimum-weight
arc, (n1,n5) and (n10,n2) respectively, and label it as part of the MST by
changing its state to Branch. Because the heads of these minimum-weight arcs are in the
Sleep state, they too must undergo the “wakeup” procedure and find their minimum-weight
edges. In this example, nodes n5 and n2 initialize to level 0 and their minimum-weight
edges are (n5,n3) and (n2,n10), respectively. This process continues until,
through the message passing process, two nodes find that their minimum-weight edges
are one and the same. Because the nodes are at the same level, they combine to form a
fragment at a higher level, and the fragment identification is the cost of the minimum-weight
edge. Figure 9b illustrates the node and arc states at this point.

Through the message passing, nodes n2 and n10 recognize that they are in the
same fragment of the MST and because they are at the same level, they each set their
fragment identification to 0.265 (the weight of the connecting arc), increase their
level to 1, and change their state to Find. Nodes n3 and n5 undergo a similar
process, increasing their level to 1 and labeling their fragment identification as 0.163.
Node n5 will also act upon a message received from n1. Because n1 is still at level 0, it is
absorbed into the n3-n5 (0.163) fragment, assuming the fragment’s level and identification
and changing its state to Find.

When the nodes are in the Find state, they are actively searching for other nodes
to add to their own fragment. Each does this by passing a “Test” message along its
minimum-weight arc in the Basic state. The receiving node compares its fragment
identification to that of the sending node. If they are the same, the arc between them is
placed in the Rejected state for both nodes, and the sending node sends a “Test” message
on its next minimum-weight edge. Figure 9c shows the partial graph formation.
The bold arcs are those whose state both head and tail nodes recognize as Branch, and
the gray directional arcs represent the arcs a “Test” message is sent across. Nodes n6 and
n3 tested nodes n2 and n1, respectively. Since each of these nodes was in the other’s
fragment, the arc was rejected (the dotted arc) and a new “Test” message is sent via the
next best arcs in the Basic state, to nodes n8 and nl.

The  process  continues,  resulting  in  three  primary  fragments  for  this  example 
(Figure  9d).  The  message  passing  continues  between  nodes  within  the  same  fragment,  to 
identify the minimum-weight arc to connect to a node in a different fragment. For
simplicity,  we  will  focus  on  the  message  from  n3  to  ril.  The  nodes  have  different 
fragment  identifications  and  are  at  the  same  level,  so  they  will  combine  the  fragments 
along  arc  (n3,nl) ,  increasing  their  level  to  2  and  assuming  the  new  fragment  identity  of 
0.266,  the  weight  of  the  adjoining  arc.  They  will  then  pass  messages  along  their  Branch 
arcs  to  the  other  nodes  in  their  fragment  so  they  will  update  their  levels  and  fragment 
identifications accordingly. Figure 9e illustrates the graph once this has occurred. The
rejected arcs at this instance are numerous and are not illustrated. Lastly, the
remaining two fragments combine and the result is the MST (Figure 9f).

Figure 9. Decentralized algorithm for finding the MST

All nodes in the initial network (a) are in the Sleep state. Nodes n1 and n10 wake up and
label their minimum arc Branch (b). Fragments form and their nodes pass messages to
identify the minimum arc to connect them to each other (c-e) until the MST results (f).

E. DISCUSSION

The  MST  problem  is  simple  to  state  and  solve  as  a  global  optimization  problem. 
However,  as  a  local,  decentralized,  and  asynchronous  process,  it  is  considerably  more 
complicated.  The  algorithm  of  Gallager  et  al.  (1983)  shows  that  individual  nodes  making 
local  decisions  can  solve  this  problem  correctly  and  efficiently.  But,  is  there  a  way  to 
interpret  this  process  as  a  local  optimization  problem?  In  the  next  chapter,  we  consider 
some  numerical  experiments  to  explore  this  possibility. 

IV. EXPERIMENTS


In  the  previous  chapters,  we  examined  two  different  classes  of  network  formation 
problems.  In  Chapter  II,  we  reviewed  a  progression  of  network  models  that  were  based 
on  local  “decisions”  (either  random  or  according  to  some  local  optimization  problem). 
The  research  emphasis  for  these  models  has  been  to  understand  the  global  network 
properties  that  result  from  these  local  decisions.  In  Chapter  III,  we  considered  a  specific 
network  design  problem,  the  MST,  and  reviewed  both  global  and  local  techniques  for 
solving  it.  In  this  chapter,  we  attempt  to  reconcile  these  two  perspectives  by  considering 
two  basic  issues. 

A.  THE  ROLE  OF  PRECEDENCE  IN  LOCAL  NETWORK  FORMATION 

In many of the network models considered here, the network is not constructed all
at once, but rather grows incrementally through the addition of nodes and arcs. A basic
question of interest is: what is the role of precedence in these network formation models?
We consider each of these network models in turn.

Random graph models. The Erdős-Rényi random graph model has one
parameter, p, the probability a node connects to any other node in the graph. The random
geometric graph model (with node locations generated from an IID random process) also
has one parameter, r, such that arcs form between nodes with distances $c_{ij} < r$. These
parameters are not affected by the order in which nodes are introduced; in fact, all nodes
“arrive” at the same time in both of these models. Therefore, in neither of these cases
does the order of node arrival affect the overall network structure.

Preferential attachment models. In preferential attachment models, nodes do
arrive one at a time. The arriving nodes have a higher probability of forming arcs to
nodes that already have several connections. In this model, precedence is important in
that nodes introduced early on in the network formation are much more likely to acquire
connections than those introduced later on. However, the nodes are essentially
interchangeable, so if the order of the nodes were rearranged, the identification of the
nodes would change, but the statistical properties of the global network would remain the
same, including the power-law distribution of the node degrees. So precedence does not
affect the overall network characteristics.

FKP  construction.  The  FKP  model  also  has  nodes  in  unique  locations,  but  in  this 
model  precedence  does  play  a  role.  Whereas  the  random  graph  and  preferential 
attachment models use probabilities to determine arc placement, this model adds arcs
based  on  which  arc  minimizes  a  node’s  objective  function.  Since  a  newly  added  node 
can  only  utilize  the  nodes  previously  added  to  meet  its  objective  function,  different 
networks  will  result  based  on  the  order  the  nodes  are  introduced  to  the  network.  What  is 
not  known  is  the  extent  to  which  precedence  plays  a  role  in  the  overall  network 
properties. 

Network  formation  games.  In  the  UCG  and  BCG  network  formation  games,  a 
network  is  formed  by  each  node  picking  a  strategy  consisting  of  arcs.  These  models  do 
not utilize network growth, but rather begin with all nodes present in the network. It is
unclear how, or whether, precedence would affect this type of local network formation.

Minimum  spanning  tree  problems.  There  is  no  role  of  precedence  in  the  MST 
problems  either.  With  the  global  algorithms,  such  as  Prim’s  and  Kruskal’s,  the  algorithm 
can  begin  with  any  node  and  result  in  a  MST.  In  the  decentralized  algorithm,  the 
message  passing  mechanism  is  not  affected  by  the  order  in  which  nodes  send  messages. 
Although  these  methods  could  produce  MSTs  with  different  structures,  their  costs  are  all 
equal.  We  focus  on  the  total  network  cost,  so  the  potential  for  different  network 
structures  is  unimportant  in  our  numerical  experiments. 

In  order  to  explore  the  FKP  models  in  more  detail,  we  conduct  two  experiments. 

1.  Reordering  of  Nodes  for  Initial  Construction  of  Network 

We generate an FKP-style network of n = 100 nodes, each with the local objective
function $\min_{j:j<i} \alpha \cdot c_{ij}$ with a - ‘ . The total network cost is the sum of all arc
costs. We compare this result to those obtained from a second network that is generated
in the same fashion. Each node in the second network has the same location as the
corresponding node in the first network.

We alter, in two ways, the order in which nodes arrive. The first uses the same
node sequence, but chooses a different start node. The second method completely
randomizes the order of the nodes.

In both cases, the network that forms is different from the initial network, while the
total network cost is similar to the initial cost. Therefore we conclude that precedence
does play a role in the initial network formation. Figure 10 demonstrates the different
networks that result from both types of reordering for a 20 node and a 100 node network.
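
The reordering experiment can be sketched as follows. This is a minimal illustration, not the code used for the reported results: it assumes uniformly random node locations in the unit square, takes the local objective to be the distance to the nearest node already present (so the value of the weighting factor does not affect the choice), and interprets "same sequence, different start node" as a cyclic rotation of the arrival order. The seed and the function name `fkp_style_cost` are hypothetical.

```python
import math
import random

def fkp_style_cost(points, order):
    """Add nodes in the given order; each new node attaches to the nearest node already present."""
    arcs = []
    total = 0.0
    for pos, i in enumerate(order):
        if pos == 0:
            continue                                  # the first node has nothing to attach to
        earlier = order[:pos]
        j = min(earlier, key=lambda k: math.dist(points[i], points[k]))
        arcs.append((i, j))
        total += math.dist(points[i], points[j])
    return arcs, total

rng = random.Random(0)                                # fixed seed, chosen only for repeatability
n = 100
points = [(rng.random(), rng.random()) for _ in range(n)]

original = list(range(n))
start = rng.randrange(n)
rotated = original[start:] + original[:start]         # same sequence, different start node
shuffled = original[:]
rng.shuffle(shuffled)                                 # completely randomized order

costs = {name: fkp_style_cost(points, order)[1]
         for name, order in [("original", original), ("rotated", rotated), ("shuffled", shuffled)]}
```

Each ordering yields a spanning tree on the same node locations, so the three total costs can be compared directly.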

Figure 10. Different FKP-style networks form when the nodes are reordered.

Networks of n = 20 and n = 100 nodes are generated from the FKP-style construction
(a,b). Different networks (c,d) result when a start node is randomly selected but the nodes
are added in the same sequence. Different networks (e,f) result from randomizing the
sequence in which the nodes are added.

2. Rewiring of Nodes to Equilibrium

In the second numerical experiment, we examine the effects of allowing the nodes
to change their initial connection. Once all nodes have been introduced to the network,
we give each node an opportunity to improve its connection cost by selecting a different
node to connect with. We term this process rewiring. Once none of the nodes in the network
can benefit from connecting to a different node, we say the network is in equilibrium. We
then compare the cost of the equilibrium network to that of the MST.

We  generate  networks  of  n  =  100  nodes  as  described  above.  Once  all  nodes  are 
added,  we  permit  the  nodes  to  rewire,  subject  to  two  constraints. 

The  first  constraint  is  that  a  node  is  only  able  to  rewire  the  arc  it  formed  when  it 
joined  the  network  and  not  any  of  the  arcs  from  other  nodes  that  attached  to  it.  The 
second  constraint  is  that  the  node  can  only  rewire  to  a  node  that  maintains  the 
connectivity  of  the  entire  network. 

We alter, in two ways, the order in which the nodes rewire. We first give the nodes the
opportunity to rewire in the same sequence in which they arrived in the network. In the
second way, we randomly select which node can rewire. Figure 11 illustrates the initial
network and the network once it has reached equilibrium.
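
The rewiring process under the two constraints can be sketched as below. This is an illustrative reading of the procedure, not the code used for the reported results: a node may move only the arc it formed on arrival, it may attach only to a node outside the part of the network that would be cut off with it, and it moves only when the new arc is strictly shorter. Random rewiring is approximated by passing a shuffled sequence. The function names and the seed are hypothetical.

```python
import math
import random

def build(points, order):
    """FKP-style construction: target[i] is the node that i attached to on arrival."""
    target = {}
    for pos, i in enumerate(order[1:], start=1):
        target[i] = min(order[:pos], key=lambda k: math.dist(points[i], points[k]))
    return target

def component_without_own_arc(target, i):
    """Nodes that stay connected to i once i's own arc (i, target[i]) is removed."""
    adj = {}
    for u, v in target.items():
        if u == i:
            continue
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def rewire_to_equilibrium(points, target, sequence):
    """Repeat passes over `sequence` until no node can shorten its own arc."""
    improved = True
    while improved:
        improved = False
        for i in sequence:
            if i not in target:
                continue                      # the first node owns no arc
            own_side = component_without_own_arc(target, i)
            candidates = [k for k in range(len(points)) if k not in own_side]
            best = min(candidates, key=lambda k: math.dist(points[i], points[k]))
            if math.dist(points[i], points[best]) < math.dist(points[i], points[target[i]]):
                target[i] = best              # strictly cheaper, and the network stays connected
                improved = True
    return target

def total_cost(points, target):
    return sum(math.dist(points[i], points[j]) for i, j in target.items())

rng = random.Random(1)                        # fixed seed, chosen only for repeatability
pts = [(rng.random(), rng.random()) for _ in range(20)]
arrival = list(range(20))
initial = build(pts, arrival)
initial_cost = total_cost(pts, initial)
equilibrium = rewire_to_equilibrium(pts, dict(initial), arrival)
```

Because every accepted move strictly lowers the total cost and the number of possible trees is finite, the loop terminates; the resulting tree need not be the MST.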
a.  b.


Figure 11.  Networks with FKP-style construction in equilibrium.


The 20 and 100 node networks from Figure 10 reach equilibrium by sequentially rewiring
their initial arcs (a,b). Different equilibrium networks (c,d) result from a random
rewiring process. Neither equilibrium network results in the MST (e,f).
B.  COMPARISON OF FKP-STYLE AND MST CONSTRUCTIONS

To examine the price of anarchy, we compare the cost of the equilibrium network
to that of the MST. We repeat the rewiring experiments 10,000 times, each time generating
the initial network, rewiring it to equilibrium, and then determining the MST. Table 1
shows the network costs associated with each type of network and compares them to the
cost of the MST. For networks with n = 100, the cost of the initial network is 46.5%
greater than that of the MST. However, once the network reaches equilibrium, the costs
decrease substantially. The cost of the network formed by the sequential method is 14.2%
greater than that of the MST, and the cost of the network formed by the randomized method
is 15.0% greater. The difference between the equilibrium network costs from the
sequential rewiring method and the randomized rewiring method is not statistically
significant. The two methods produced equilibrium networks with the same cost in
26% of the experiments.


Network type           Average cost   Standard error   Ratio to cost of the MST

Initial FKP-style      19.791         0.874            1.465
Sequential rewiring    15.422         0.609            1.142
Random rewiring        15.533         0.668            1.150
MST                    13.506         0.426            1.000

Table 1.  Cost comparison of initial network, equilibrium network and MST for n = 100.


Results based on 10,000 trials of networks with n = 100. The difference between the costs
of the sequentially rewired equilibrium network and the randomly rewired equilibrium
network is not statistically significant.
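
As a quick arithmetic check, the ratios in Table 1 can be recovered from the average costs. The sketch below assumes that each ratio is the average cost divided by the average MST cost; the experiments may instead have averaged per-trial ratios, which would not necessarily give identical values. The variable names are illustrative only.

```python
# Average costs copied from Table 1.
average_cost = {
    "Initial FKP-style": 19.791,
    "Sequential rewiring": 15.422,
    "Random rewiring": 15.533,
    "MST": 13.506,
}

# Ratio of each average cost to the average MST cost, rounded to three decimals.
ratio_to_mst = {name: round(cost / average_cost["MST"], 3)
                for name, cost in average_cost.items()}

for name, ratio in ratio_to_mst.items():
    # Percentage by which the cost exceeds that of the MST.
    print(f"{name:20s} ratio {ratio:.3f}  excess {100 * (ratio - 1):.1f}%")
```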


The results indicate that either method of local rewiring can improve the total cost
significantly, but the equilibrium cost is still considerably worse than that of the MST.
The equilibrium network cost is approximately 15% greater than that of the MST, and in
none of the 10,000 trials is the equilibrium network equal to the MST. This leads to the
question: To what extent, if any, is the method that forms the locally optimized network
comparable to the methods that construct the MST? Since this model does not achieve the
MST, we explore what might be preventing it from doing so.
The first issue is precedence in the formation process itself. The experiments
demonstrate that the order in which the nodes connect or rewire their connections
produces different networks. However, it is conceivable that nodes could be introduced
into a network in an order that would result in the model producing the MST. Figure 12a
depicts an FKP-style network in equilibrium and Figure 12b is the MST. It is relatively
easy to envision an ordering of nodes that would result in this MST. Since it is a tree,
imagine starting with the root and moving down the branches. For example, assume node
n1 is the first node in the network. Because the connection costs for the FKP-style model
are based on Euclidean distance, the second node introduced could be n10, n18 or n4
(any node with an arc to n1 in the MST). If n10 were the second node, then the third
node could be n4, n18 or n19, and so on. However, if n9 were the second node, it would
have to attach to n1, the only node in the network, and the constructed network would
not be the MST (although the MST might still emerge if the network were allowed to attain
equilibrium). As long as the order in which the nodes are introduced follows the MST
outward from the root, so that each node arrives after the node that precedes it on its
MST path to the root, the model would result in the MST.
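
One ordering that satisfies this condition is the order in which Prim's algorithm adds nodes to the MST. The sketch below is an illustration we add under stated assumptions (points drawn uniformly from the unit square, no ties in distance, and hypothetical function names prim_order and greedy_attach): it records Prim's order and then replays it through the nearest-node attachment rule of the FKP-style construction.

```python
import math
import random

def prim_order(points):
    """Run Prim's algorithm from node 0.

    Returns the order in which nodes join the tree and the set of MST arcs.
    """
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n   # distance to the nearest node already in the tree
    best_from = [None] * n       # which tree node realizes that distance
    best_dist[0] = 0.0
    order, arcs = [], set()
    for _ in range(n):
        v = min((j for j in range(n) if not in_tree[j]), key=lambda j: best_dist[j])
        in_tree[v] = True
        order.append(v)
        if best_from[v] is not None:
            arcs.add(frozenset((v, best_from[v])))
        for w in range(n):
            if not in_tree[w]:
                d = math.dist(points[v], points[w])
                if d < best_dist[w]:
                    best_dist[w], best_from[w] = d, v
    return order, arcs

def greedy_attach(points, order):
    """FKP-style construction without the centrality term:
    each arriving node attaches to the nearest node already present."""
    arcs = set()
    for k in range(1, len(order)):
        v = order[k]
        u = min(order[:k], key=lambda j: math.dist(points[v], points[j]))
        arcs.add(frozenset((v, u)))
    return arcs

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(20)]
order, mst_arcs = prim_order(points)
grown = greedy_attach(points, order)
print(grown == mst_arcs)
```

The two arc sets coincide because, at the moment Prim's algorithm adds a node, the arc it uses is by construction the shortest arc from that node to the current tree, which is exactly the arc the local attachment rule selects.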


Figure  12.  A  MST  for  a  network  with  n  =  20. 

A second reason that the rewiring heuristic may not induce the MST is that it
could be too restrictive. Each node evaluates only one arc at a time, and can only
consider replacing it with an arc in the cut produced when it removes the arc it is
reevaluating. This constraint is necessary to ensure a connected network. However, there
could potentially be a combination of two (or more) arcs that could be rewired
simultaneously to achieve the MST.

The other restriction is that a node can only rewire the arc it formed when
joining the network, and not any of the arcs from nodes that may have connected to it
later on. This might also be preventing the model from developing the MST. This case can
be seen in the networks in Figure 13. Figure 13a is the network that is initially formed.
In this network, n2 has to attach to n1, so the (n2,n1) arc is the only one n2 is able to
reevaluate. Node n16 attached to n15, so it is only able to evaluate the (n16,n15) arc.
When the network is in equilibrium (Figure 13b), n2 has rewired its arc to (n2,n12) and
the (n16,n15) arc has become the (n16,n17) arc. Both of these changes occurred because
the new arcs are shorter. In this method, the (n2,n16) arc, which is in the MST
(Figure 12), will never result. However, if n2 had the ability to evaluate all its arcs,
the (n2,n12) arc would become the (n2,n16) arc, which is in the MST.


a.  b. 

Figure  13.  FKP-style  network  with  n  =20  nodes. 
The  initially  constructed  network  (a)  and  its  equilibrium  network  (b). 
C.  DISCUSSION

This  work  primarily  focuses  on  incremental  network  construction  based  on  local, 
myopic  decisions.  The  major  difference  between  this  model  and  the  FKP  model  is  that 
we  removed  the  tradeoff  aspect  by  not  using  the  centrality  term  in  the  local  objective 
function.  We  simplified  the  objective  function  so  we  can  compare  the  results  of  the 
formed  networks  to  the  optimal  network,  the  MST,  to  explore  the  price  of  anarchy. 

We  demonstrate  that  precedence  plays  a  role  in  the  network  formation,  producing 
initial  networks  with  different  costs.  We  also  demonstrate  that  the  sequential  rewiring 
process  and  the  random  rewiring  process  substantially  improve  the  network  cost,  but  that 
there  is  no  statistical  difference  between  the  two  methods. 

The  work  in  this  thesis  lays  the  groundwork  for  more  complex  numerical 
experiments  and  deeper  analyses.  The  heuristic  model  with  rewiring  improves  the  total 
network  cost  substantially,  but  does  not  achieve  the  MST.  The  extent  to  which  the 
constraints of the model prevent it from obtaining the optimal result is unclear. To
explore this, the model could be altered so that each node has the option to rewire any
of its incident arcs, not just the one it initially formed.

The  model  also  has  room  for  expansion.  We  did  not  use  a  measure  of  centrality 
in  the  objective  equation,  but  this  can  be  added  to  the  model  to  explore  the  effects  on  the 
network structure due to trade-offs between centrality and distance. We also primarily
focus on forming networks with a tree structure, whereas many real-world networks
contain cycles. This model can be altered to allow a node to form more than one arc when
it joins a network.
V.  CONCLUSIONS AND FUTURE WORK


Understanding  the  drivers  of  complex  network  formation  is  nontrivial.  In  this 
thesis,  we  focused  on  models  of  network  formation  with  emphasis  on  both  centralized 
and  localized  algorithms  for  solving  the  minimum  spanning  tree  problem.  We  developed 
a local, heuristic model based on an FKP-style construction, which uses rewiring to
produce  networks  in  equilibrium.  We  examined  the  price  of  anarchy  of  these  networks 
due  to  myopic  node  behavior.  Although  the  rewired  heuristic  model  does  not  always 
produce  the  MST,  we  demonstrate  that  there  exist  orderings  of  nodes  that  can  obtain  it. 
This  leads  to  the  question:  Is  there  an  interpretation  of  the  local,  myopic  decision 
process  of  the  FKP-style  construction  that  lends  itself  to  an  equivalent  global 
optimization  problem?  If  the  answer  is  affirmative,  then  the  local  and  global  methods 
would  both  provide  the  optimal  solution  and  the  price  of  anarchy  would  be  zero.  This 
could  have  significant  implications  for  the  formation  of  real  network  systems  when 
global  information  and  central  decision  processes  are  not  possible. 

Is there evidence to suggest that such an interpretation is possible? Here, we
appeal to the notion of duality in network optimization problems and note that there is a
considerable literature on the use of duality arguments for the development of
decentralized algorithms (see Bertsekas and Tsitsiklis, 1997, for an in-depth treatment).

The  Internet  is  an  example  where  duality  arguments  have  recently  enhanced  our 
understanding  of  complex  network  behavior.  The  Transmission  Control  Protocol  (TCP) 
is  fundamental  to  the  operation  of  the  Internet.  It  guarantees  end-to-end  delivery  of  data 
packets  by  recognizing  and  retransmitting  packets  that  are  lost,  and  it  also  controls  the 
rate  at  which  individual  computers  inject  packets  into  the  network.  Like  most  of  the 
protocols  used  in  the  Internet,  TCP  was  developed  in  an  ad  hoc  manner,  based  on 
engineering  intuition  and  trial-and-error  more  than  mathematical  theory.  To  researchers 
in the network science community, the behavior of TCP seemed like a case of
self-organization (Veres and Boda, 2000). However, research over the last decade has shown
that TCP and its complementary protocol Active Queue Management (or AQM, which
runs in routers to manage the size of their limited buffers) work together as a primal-dual
algorithm to solve a global resource allocation problem in a decentralized and
asynchronous manner (Kelly, Maulloo, and Tan, 1998; Low, 2003). This type of analysis
is  not  only  bringing  greater  understanding  to  the  way  that  the  existing  Internet  works 
(Srikant,  2004),  but  it  is  also  helping  to  influence  the  design  of  future  network  protocols 
(Chiang,  Low,  Calderbank  and  Doyle,  2007). 
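
For concreteness, the global resource allocation problem referred to above is commonly written as a network utility maximization problem. The formulation below is the standard textbook form and is not derived in this thesis; the notation is assumed here: x_s is the sending rate of source s, U_s is a concave utility function, r(s) is the set of links used by source s, c_l is the capacity of link l, p_l is a nonnegative link price, and gamma is a small positive step size.

```latex
\begin{align*}
\text{(global problem)}\quad
  & \max_{x \ge 0} \; \sum_{s} U_s(x_s)
    \quad \text{subject to} \quad
    \sum_{s:\, l \in r(s)} x_s \le c_l \quad \text{for every link } l, \\
\text{(source update)}\quad
  & x_s = \arg\max_{x \ge 0} \Big( U_s(x) - x \sum_{l \in r(s)} p_l \Big), \\
\text{(link update)}\quad
  & p_l \leftarrow \Big[\, p_l + \gamma \Big( \sum_{s:\, l \in r(s)} x_s - c_l \Big) \Big]^{+}.
\end{align*}
```

In this reading, the source update plays the role of TCP and the link update plays the role of AQM: each source needs only the total price along its own path, and each link needs only its own aggregate traffic, yet together the updates perform a gradient projection on the dual of the global problem.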

While  there  remains  considerable  work  to  understand  the  forces  governing 
complex  network  behavior,  it  is  clear  that  optimization  is  an  important  tool  for  exploring 
the  tradeoffs  at  work  in  network  formation.  Identifying  the  precise  mechanisms  at  work 
in  specific  applications,  as  well  as  how  to  improve  them,  will  be  a  topic  of  future 
research. 